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Abstract 

Partitioning a graph into three pieces, with two of them large and con- 
nected, and the third a smaU "separator" set, is useful for improving the 
performance of a number of combinatorial algorithms. This is done using the 
second eigenvector of a matrix defined solely in terms of the incidence matrix, 
called the graph Laplacian. For sparse graphs, the eigenvector can be efficiently 
computed using the Lanczos algorithm. This graph partitioning algorithm is 
extended to provide a complete hierarchical subdivision of the graph. The 
method has been implemented and numerical results obtained both for simple 
test problems and for several grid graphs. 
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1 Introduction 



This report is a detailing of the research carried out by myself during the four months 
of Semester 1 (March- June) 1991, at the Department of Mathematics, University 
of Queensland, Australia. It was carried out under the supervision of Dr David E 
Stewart, and was accredited as a #30 project, subject classification code MN881. 

1.1 Origins and Acknowledgements 

This work was motivated by the publication of a paper pl| ], titled "Partitioning 
Sparse Matrices with Eigenvectors of Graphs", by Alex Pothen, Horst D. Simon 
and Kang-Pu Liou in the SIAM Journal on Matrix Analysis and Applications, 
September 1990, ll(3):430-452, and emulates some of the implementation con- 
tained therein. Their paper is mainly based on some work appearing in an earlier 
paper titled "A Property of Eigenvectors of Nonnegative Symmetric Matrices 
and its Application to Graph Theory" , by Miroslav Fiedler in the Czechoslovak 
Mathematical Journal, 1975, 25(100) :619-633, which provides a substantial amount 
of the theory quoted in §||. 

I implemented these results in C, interfacing with a library of C data-structures 
and functions called meschach j2^, written by my supervisor, Dr David Stewart. 
He is also responsible for selected pieces of code that I have used (see Appendix 
0), as well as large amounts of time spent educating me in UNIX, C, algorithms, 
numerical linear algebra and scientific writing. Proof reading was also done by my 
wife, Nerida Iwasiuk. 

1.2 Motivation and Applications 

The problem dealt with in this report is to partition the set of vertices N = 
{l,...,n} of an undirected, unvaluated graph G — {N,E) into 3 disjoint sets; a 
minimally small separator set S, and 3 "banks" A and B, of size of order n/2 each. 

Immediate applications of this partitioning apply the "divide-and-conquer" 
paradigm to a range of graph-theoretic problems such as the travelling sales repre- 
sentative problem. Industrial application problems, in particular those relating to 
the layout of components in VLSI design, are discussed in |l^. Other application 
problems are available in the Boeing-Harwell sparse matrix test problem library j^] , 
and include large-scale network- analysis problems such as power-grid distribution 
problems. 

An immediate application in numerical linear algebra is in the efficient solution 
of large sparse linear systems via factorisation and parallel solution of subproblems. 
Methods for doing this are outlined below. 
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Consider the solution of the n x n (sparse) symmetric linear system Cx = f . If 
a permutation of the indices of C can be made, such that in block form: 



PCP^x = 



Ha 



Hr mJ 



IB ly-LB 

Ma Mb Hg 





XA 




' f A ' 




XB 















then the system can often be solved much faster. Here Ha, Hb and Hs are symmet- 
ric, and Ma, Mb are usually not (they are usually not even square). The system can 
be solved using the LDL^ (Cholesky-type) factorisation for indefinite, symmetric 
matrices. Firstly, be aware that simple Cholesky factorisation LL^ is only appli- 
cable to positive definite matrices, and problems dealt with are not guaranteed to 
be positive definite. (Note that most of the experimental graphs used in this report 
are 5-point grid graphs, which are always positive semi-definite.) Whilst the graph 
Laplacian matrices mentioned here are singular, the C matrices actually involved 
are not. Writing PCP'^ = LDL^ in block form gives: 



Ha 



Ma 



Hb 
Mb 



Ml 
Hs 



La . 
. Lb . 

Na Nb Ls 



D 



YA 




U 






iB 


ys 




fs 



where La, Lb are lower triangular, Ls is symmetric but not generally triangular, 
and Na and Nb are generally sparse \S\ x \A\ and \S\ x \B\ matrices. In a similar 
procedure to the usual Cholesky procedure, the solution of the system is done in 
stages (3, not 2, as there is also a D). The first stage in the solution of L{DL^x) = f 
is the solution for y = DL'^x of Ly = f by: 

La ■ 

■ Lb . 
Na Nb Ls 

To solve this block system, it can be written as: 

LaYa = fA 

LbYb = fs 

Lsys = fs - Nay A - NbYb- 

The third equation is a (dense) order \S\ system. If l^] is small, it can be solved at 
no great expense, using simple Gaussian elimination. This solution depends on the 
prior solution of the two earlier systems for ya and ys. These are sparse, linear, 
lower triangular systems of size roughly n/2, and are cheap to solve using forward 
substitution. 

The second stage is the (trivial) solution of D{L^x) = y, using L^x = z and 
Dz = y; which is z = D~^y, as Z> is a diagonal matrix. The third stage is similar 
to the first stage, and provides a solution of L^x = z, which is the solution to the 
whole problem Cx = f : 



Ni 





XA 




ZA 




XB 
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Similarly to the first stage, we solve first for X5 the order |5'| dense system Ljxs = 
Z5, then substitute this into: 

L^x^ = ZA~ nJxs 

to obtain and x^ cheaply, by backward substitution. 

An alternative to the LDL^ factorisation is the direct solution of {PCP^)x = f . 
This method involves recursively partitioning the large sparse matrix C, and is called 
"Nested Dissection" We write out the equations in block form: 



Haxa = fA - X5 

Hsxs = fs - Ma^x-a - Mb^b- 

"Solving" the first 2 equations for Xyi and x^, then substituting the results into 
the third yields an order \S\ dense system, which gives xg as the solution to: 

MaH^Ha - Afjxs + MbH^Hb - Ml^s + Hs^s = fs- 
Explicitly: 

X5 = [MaH^^mJ + MbH^HiI + Hsy\fs - MaH^Ha - MbH^Hb). 

Note that whilst the graph Laplacian is singular, the original C matrix is generally 
non-singular, so there is no problem writing this explicitly. After the solution for 
xg, we have x^ and x^ as the solution to the following order \A\ and \B\ sparse 
systems, explicitly written as: 

XA = i/^i(fA-Mjxs) 
XB = Hs\iB-Ml^s)- 

If 1^1 K, \B\ K, n/2, and \S\ is small, the computational expense is minimised. 
This situation is amenable to implementation on parallel processing machines, since 
the 2 linear systems are decoupled. If these are too large to solve simply, the de- 
composition of C can be repeated on Ha and Hb- The factorisation process can be 
recursively continued until units of a desired "atomic" size are obtained - perhaps 
somewhere between 3 and 10. A more thorough analysis could be performed to de- 
cide the optimal choice, such that overall computational expense is minimised. This 
decomposition has been performed for several example graphs in 

In this report, the graphs considered are sparse (the average degree of the vertices 

is low), and large (at current computational capabilities, n is to 10^). m 

illustrate graphs with n up to 32400 (for a 180 x 180 5-point grid, average degree 



4) . An example of a 5-point grid graph is provided in § |6.5| . (9-point grid graphs are 
similar, except that each vertex is connected to its nearest 8 neighbours, rather than 
its nearest 4.) Larger scale applications are available in the Boeing-Harwell sparse 
matrix test problem library p|, which quotes examples of size up to 44609. 
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1.3 Broad Outline of the Report 



The algorithm for partitioning graphs is dependent on analysis of some of the prop- 
erties of "M-Matrices" , of which the graph Laplacian is an instance. §2.1 describes 
these entities and some of their properties. The graph Laplacian is introduced, and 

shown to be an M-Matrix in §2.2. 

An amazing result (of Fiedler, 1975 1^), presented in Algorithm 2T on page [T^, 
relates the discrete mathematics of the interconnections of the graph to the continu- 
ous mathematics of the eigendecomposition of its "graph Laplacian" (defined in §2.2 
in terms of the incidence matrix of the graph) . This result takes the components of 
the second eigenvector of the graph Laplacian matrix, and uses them in a valuation 
of the vertices of the graph. An edge separator set of the graph is then computed, 
such that the vertices are partitioned by this set into 2 connected banks of very 
similar size. Using this edge separator set, a combinatorial algorithm creates the 3 



vertex sets described in §1.2. This algorithm is fairly efficient, and has a number of 



immediate applications. Many large tasks can be simplified by using this algorithm 
in a "divide-and-conquer" approach, which can greatly improve computational cfh- 
ciency. The Lanczos algorithm (to determine eigenvalues of large, sparse matrices) 
is implemented as a part of the overall algorithm, and is discussed in j |2.6| . 

This report firstly details experiments performed in synthesising the results of 
pl[ , who used the above strategy in computing separator sets of sparse graphs of 
order 10^ to 10^ vertices. Secondly, it demonstrates the recursive decomposition 
of a graph down to atomic-unit sized subgraphs. This appears to be a new (but 
probably obvious) implementation. 



1.4 Contents of the Report 

is the largest part of this report, and contains seven subsections. It describes 
the theory behind the entire graph partitioning process. §^ describes the implemen- 
tation in C of the algorithms in §^ and references the source code in Appendix 
. §^ mentions the success of the code in dealing with a number of test examples. 
Results are compared, where possible to literature and touchstone cases (in par- 
ticular, partitioning 5-point grid graphs). §^ also describes problems encountered 
in programming and implementation. §y outlines the directions in which the work 
could be continued, to make the code more usef ul. T he decomposition of a small 
example graph (Moshe) is presented in Appendix |6.2| . 
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2 Theory 



This section has seven subsections: 

1. M-Matrices, their origins, applications, definitions, and some interesting the- 
orems. 

2. Graph Laplacian, its definition, and properties. 

3. Graph Partitioning, using the second eigenvalue of the graph Laplacian 

4. Spectral Partitioning Algorithm, developed from the graph partitioning result 

5. Minimum Cover - algorithms involved in finding the minimum vertex cover 
of a bipartite graph. 

6. Lanczos Algorithm - required for the implementation of the spectral par- 
titioning algorithm on sparse matrices. 

7. Recursive Decomposition - recursive implementation of the partitioning pro- 
cess. 

In this report, G — {N, E) refers to a graph, where N ~ {1, . . . is the set of 
vertices. A always refers to a real, square matrix of size n. This is a sufficient, but 
not necessary requirement for some of the definitions and theorems in j ^2.l| . In later 
sections, specific theorems that also depend on A being symmetric are quoted. 

2.1 M-Matrices 

M-Matrices are a subclass of real matrices that are closely related to non-negative 
matrices JTs], and have a number of interesting properties (for instance, see Theorem 



2.5). They arise in several specific fields, e.g.: 



1. The discretisation of boundary- value partial differential equations (both sym- 
metric and non-symmetric) generates matrices which are the negative of M- 
Matrices. Several examples of the discretisation of Poisson's Equation (viz 
y'^u{x,y) = f{x,y)) on rectangular 5-point grids are explicitly demonstrated 
in §^ of this report. See also §2.2. 



2. Continuous-Time Markov Processes. The differential equations for probabili- 
ties are usually (singular) M-Matrices. 

3. Economics. 

M-Matrices were introduced in 1937 in p7| , and their properties have been inves- 
tigated by a number of researchers. Material presented in this section is abstracted 
from several sources |[ |l^, |9|. M-Matrices have a multiplicity of possible def- 
initions (e.g. Q lists 50 definitions for nonsingular M-Matrices). This allows great 
flexibility in proving results which involve them. The following definition of an M- 



Matrix is quoted from |15 
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Definition 2.1 A is an M-Matrix if there is a non-negative matrix B E M"^" 
with maximal eigenvalue r, and c ^ r such that A — cin — B. 

The main diagonal entries of an M-Matrix are non-negative and all of its other 
entries are non-positive. Fiedler ||^ calls an M-Matrix a matrix of class Kq, and a 
nonsingular M-Matrix a matrix of class K. 

Definition 2.2 i^„ = {A \ ^0, i 7^ j}. 

That is, Fn is the set of real matrices whose off-diagonal elements are non- 
positive. The following results are some necessary and sufficient conditions for an 
element of Fn to be an M-Matrix, and are provided mainly as an introduction to 
some of the properties of M-Matrices. The first is a sufficient condition: 

Theorem 2.1 A E Fn is an M-Matrix iff all its eigenvalues have a non-negative 
real part. 

The second is a necessary condition: 
Theorem 2.2 A E Fn is an M-Matrix iff every real eigenvalue of A is non-negative. 

Definition 2.3 Given M as a non-empty subset of N, define the principal sub- 
matrix of A corresponding to M as A{M) = {aij\i,j S M}. 

Theorem 2.3 A principal submatrix of an M-Matrix is an M-Matrix. 



Theorem 2.4 A G Fn is an M-Matrix iff all its principal minors are non-negative. 

The next result is important, and is mentioned in 
Theorem 2.5 A G Fn is a nonsingular M-Matrix iff A^^ is non-negative. 

Another definition of Af -matrices is: 

Definition 2.4 A is an M-Matrix iff all its off-diagonal entries are ^ 0, and 
all its principal minors are non-negative. It is a nonsingular M-Matrix if its 
principal minors are positive. 

Theorems 2.6 and 2.7 are proven in |Q and p^ . 

Theorem 2.6 A nonsingular M-Matrix has A^^ non-negative, and if it is irre- 
ducible, A^^ is positive (that is, none of its entries are zero). 

Theorem 2.7 If A is an irreducible, singular M-Matrix, then: 

1. is a simple eigenvalue of A. 

2. There is, up to a scale factor, a unique non-zero eigenvector u € R" with 
eigenvalue zero, and all the components o/u are non-positive or non-negative. 



Theorem p.8| applies to symmetric matrices, and follows from Definition |2.4 
and the properties of positive definite (or semidefinite) matrices. 

Theorem 2.8 A symmetric matrix is an M-Matrix if all its off- diagonal entries 
are non-positive and all its eigenvalues non-negative. If its eigenvalues are positive, 
then it is a non-singular M-Matrix. 
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2.2 Graph Laplacian 

This section introduces the graph Laplacian, and defines it in a number ol ways. 

Deflnition 2.5 Given a graph G = {N,E), define the degree of vertex i as di, the 
number of edges in E with one end being vertex i. 

Definition 2.6 Given a graph G = {N,E), the graph Laplacian Q «s the 

matrix L(G) (here G is the incidence matrix of the graph) of the quadratic form: 

■x^L{G)^ ^ iL{G)x, x) = ^ (x, - Xjf. 

Thus L{G) = (iy), where kj = Iji = ■{ -1 ij^j, (ij) E E 

di i = j. 

The structure of the graph Laplacian is easier to see in another way. Define 
M e K"^" as the adjacency matrix of G, and D = diag(di, . . . , d„) G M"""". 
The graph Laplacian is then L{G) — D ~ M . This definition means that L{G) is 
immediately seen as being a matrix with —1 replacing 1 in M, and a diagonal whose 
ith element is the number of non-zero off-diagonal elements in row or column i of 
M. This is illustrated by Figures |^ and ^, and is the form used in implementation. 

As the row and column sums of the graph Laplacian are zero, e = [1, . . . , 1]^ G 
R" is an eigenvector corresponding to the eigenvalue 0. As it has a zero eige nval ue, 
the graph Laplacian is singular. As it satisfies the requirements of Theorem |2.8| , it 
is an M-Matrix. Hence, the graph Laplacian is a singular M-Matrix, and is able to 



be used in some of the theorems in §2.3 



Experimental graphs considered in this report include several 5-point grid graphs 
associated with the discretisation of elliptic Boundary- Value PDEs on rectangular 
regions. If the rectangle is reduced to an m x n grid, the graph Laplacian is an 
mn X mn sparse matrix, of very definite structure. Consider Poisson's Equation 
in two dimensions, with boundary conditions for u{x, y) on a rectangle: 



(It is Laplace's Equation if f{x, y) = 0.) Discretisation on a regular m x n grid, 
using the notation /y = f{xi,yj), and Uij = u{xi,yj) gives: 



Ui+i,j -t- Ui-i J -I- Uij-i + "ij+i - 4M,y = h^f, 



2 = 2, . . . , m — 1 
j = 2,...,n-l. 



This generates an mn x mn linear system Cu = f on a 5-point grid graph G. 
The graph Laplacian L{G) is an m x m block-tridiagonal matrix, where each block 
is a symmetric n x n matrix. 
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L(G) = 



P In . 

T P T 

J P J 



where: 



P = 



T P J 

. /„ P 



-41.. 
1-41. 
1 -4 1 



1 -4 1 
1 -4 

The st ruct ure of C is closely related to this. A specific example is presented in 
Appendix 6.5. For 9-point grids or non-elliptic PDEs, the structures are different, 
but the solution techniques are essentially the same. 



2.3 Graph Partitioning 

This section outlines the work of and theorems listed are taken directly from it. 

Definition 2.7 Given an undirected, unvaluated graph on n vertices, G = {N,E), 
a vertex separator set is a subset M of N , such that removal of the vertices in 
M from G, together with all edges in E containing them, will disconnect the graph. 



For the purposes of this report, consider only graphs that are connected. We are 
interested in vertex separator sets that disconnect the graph into 2 subgraphs with 
approximately equal numbers of vertices. We are particularly concerned with sparse 
graphs and small separator sets; but we want algorithms that will automatically 
perform adequate partitioning of any graph. 

Ideally, a partitioning algorithm should find a separator set S of absolute mini- 
mal cardinality, but this appears to be a combinatorially explosive problem (prob- 
ably NP-complete) , and is regarded as infeasible. Instead, we attempt to find an S 
that is "reasonably" small, at least such that IS*! <C n, and partitions N into two 
sets of about the same size. 
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It is difficult to quantify good separators |Tj]. The ideal choice is determined 
by the particular problem, and it is always possible that a "small" separator set 
does not exist The extreme example is the completely connected graph, where a 
separator set is of order of the same size as the original set. For planar graphs, the 
minimum size of a separator set is always ^ \/8n, and the resulting partition has 2 
sides each with no more than 2/3 of the total number of vertices |Q. This fact may 
be of some interest, as many of the large practical problems are either planar, or very 
nearly so (see ||, ^ |2^)- Liu, 1989 p3 describes the notion of "good" separator 
sets in some detail, and Pothen et al. pl| provide several numerical bounds on the 
minimum possible sizes of separator sets for a given graph. Many of the bounds 
may be grossly overestimated, and, as they involve calculation of several of the 
eigenvalues of the graph Laplacian (an expensive and error-prone procedure), are 
not particularly useful. 

Methods presented in the literature can be found in the references to [|, 11, 1^, 



|l|,|l|, and are not reiterated here. This section deals with a new process, attributed 
to [ 21 1 . The formal construction of this process depends on theory attributed to ||] , 
which begins with a number of relevant definitions. 

Definition 2.8 diameter(G') = max {minimum path length 

That is, the diameter of a graph G = {N, E) is the maximum value of the 
minimum length path between any two vertices. 

Definition 2.9 A decomposition of a graph G ~ {N, E) is a disjoint vertex cover 
N = NiU N2, such that Ni,N2^9 and A^i n 7V2 = 0- 

Definition 2.10 A is called irreducible if, for no decomposition of N = Ni U 

N2, a,j = 0, ie Ni,j e N2. 



This corresponds to connectedness of graphs. Using Definition 2.10 for symmetric 
A e M"''", define: 

Definition 2.11 A is of degree of reducibility d E [0,n — 1] if there exists a 

d+l 

decomposition of N into d+l non-empty disjoint subsets N — \^ Ni such that: 

1. A{Ni) are irreducible, i — 1, . . . , d + I. 

2. Qpq ^0, pe N„qe Nj,i ^ j. 

An irreducible symmetric matrix has d = Q. Thus, a graph has d = if it is 
connected. 

Definition 2.12 The n eigenvalues of A (some are possibly multiple) are ordered 
in increasing size: Ai ^ A2 ^ . . . ^ A„. 

Similarly, the corresponding eigenvectors are ordered - that is, the ith eigen- 
vector of A is the eigenvector corresponding to the ith smallest eigenvalue. For the 
purposes of the Graph Partitioning theorem, it will turn out that all eigenvalues 
are non-negative. The smallest will always be 0, of multiplicity equal to the number 
of connected components of the graph (or degree of reducibility). For 5-point grid 
graphs, the largest eigenvalue, by Theorem 2.£ p341], will always be ^ 8. 
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Theorem 2.9 (Gershgorin Circle Theorem) Given a matrix X ^AX E C"^", 
such that X^^AX = diag (di, . . . , dn) + F, where F has zero diagonal entries, then: 

n I n 

\{A) C IJ A, where A = < ^ e C such that \\z - d,\\ sC ^ \f^j\ 
i=i I j=i 



That is, the eigenvalues of a complex matrix lie within the union of a set of 
n closed circles in the complex plane. The centre of each circle is at the point 
corresponding to a diagonal element, and its radius is the sum of the absolute 
values of the non-diagonal elements of that row. Note that it is usual to use X = I. 
As applied to A, the real graph Laplacian of a grid graph, where: 

1. The sums of the absolute values of the non-diagonal entries are exactly equal 
to the diagonal elements. 

2. The diagonal elements of A are all of value cither 2, 3 or 4. 

3. All the eigenvalues of A are real (as A is real and symmetric). 

we obtain \{A) C [0, 8]. More generally, if ^ is a graph Laplacian, X{A) C [0, 2A], 
where A — max di . 



From Theorems 2.7 and 



Theorem 2.10 // A is symmetric, irreducible, has non-negative off-diagonal en- 
tries, and Az — for some real n-vector z, which is neither zero, positive nor 
negative; then A is not positive semidefinite. 



Theorem 2.11 is a corollary to a more general result in 



Theorem 2.11 If A is non-negative, irreducible and symmetric, and G M" is 
the dth eigenvector of A, d^ 2; then M = {i G N\yi ^ 0} is non-null and the degree 
of reducibility of A{M) ^ d — 2. 

This means that, choosing d = 2 and v as the second eigenvector of A] the degree 
of reducibility oi A{M) ^ 0, which means that it is 0. Using the second eigenvector 
of A as V thus yields an irreducible (connected) component. 

Fiedler's paper |6| goes beyond the needs of this report in defining the graph 
Laplacian and associated results for the case of valuated (but not directed) graphs. 
The following graph-theoretic results are a simplification of those presented in Sec- 
tion 3 of Q. 

Definition 2.13 A cut C of a graph G is a set of edges to which a decomposition 
N = Ni U N2, where Ni n N2 = 0, exists, such that C consists exactly of all edges 
in G with one vertex in each set of the decomposition. The subgraphs of G induced 
by the subsets Ni and N2 are called banks. 



Theorem 2.12 (Unique Decomposition of Connected Banks) // there is a 
decomposition N = iVi U N2 of a graph G, corresponding to a cut C , such that both 
corresponding banks are connected, then the decomposition of N corresponding to C 
is unique. 
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Definition 2.14 The algebraic connectivity ^ of L{G) is defined as the small- 
est non-zero eigenvalue of G. Corresponding to this is the characteristic valua- 
tion, which is an assignment of the elements of the eigenvector corresponding to 
the this eigenvalue of L{G). 



As mentioned in §2.2, the smallest eigenvalue is always 0, of multiplicity equal 
to the number of components (connected units) of G. For connected graphs, the 
algebraic connectivity and characteristic valuation are equal to the second eigenvalue 
A2 and eigenvector V2, respectively. 

Theorem 2.13 For any real r, define M{r) = {i G N\yi ^ — r}. The subgraph G{r) 
induced by G on M{r) is connected. 



Theorem 2.14| is the fundamental result required by the graph partitioning al- 
gorithm. 

Theorem 2.14 (Main Graph-Partitioning Theorem) // there exists a real c 
such that ^ c < maxy^ and c ^ yi Vi, then the set of edges {i,j) of G for which 

i 

yi < c < yj forms a cut G of G. If Ni = {j e N\y-j > c} and N2 = {j e N\yj < c}, 
then N = NiU N2 is a decomposition of N corresponding to G, and the bank G{N2) 
is connected. 



Theorem ^.12 shows that the decomposition and the banks are uniquely deter- 



mined. Define Ni ^ {i e N\yi > 0} and N2 ^ {i e N\yi < 0}, then iV = iVi U iV2 



is the decomposition corresponding to G. Theorem 2.15, shows that all cuts with 
connected banks in a connected graph are able to be obtained (via the second 
eigenvector of the graph Laplacian). 

Theorem 2.15 If G is a connected graph with a cut C such that both banks of 
G are connected then there is a positive valuation of the edges of G such that the 
corresponding characteristic valuation y is unique (up to a factor), and yi ^ V«, 
and G is formed exactly by alternating edges of the valuation y. 

In summary, the essential results in Q are contained in Algorithm |2.l| (that I 



have composed), which is directly referred to in the beginning of §2.4 
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Algorithm 2.1 (Edge Separator Algorithm) Given an graph G = {N,E), cal- 
culate a edge separator set (cut) Ei G E such that the 2 resulting banks are connected 
and have approximately equal numbers of vertices. 



1. Galculate the graph Laplacian L{G) ofG, and the smallest non-zero eigenvalue 
of G (the algebraic connectivity). If G is connected, the eigenvalue will be 
of multiplicity 1, in which case, the algebraic connectivity is equal to X2, the 
second eigenvalue of L{G). 

2. Calculate the corresponding (second) eigenvector (the characteristic valua- 
tion). 

3. Assign to the vertices of G the n elements of the characteristic valuation. 

4. Find the set of edges Ei C E, whose characteristic valuations cross the median 
value of the components of the second eigenvector. 

5. El is the cut required edge separator set. Choice of Ei from edges whose ver- 
tices cross some point between two other components of the characteristic val- 
uation will also yield 2 connected banks, but their sizes will not be nearly equal. 



2.4 Spectral Partitioning Algorithm 



The idea of using the results contained in Algorithm 2.1 in an algorithm to partition 
graphs into vertex sets appears in Pothen et al., 1990 |2^. They describe their 
algorithm as a "Spectral Partitioning Algorithm" , and this convention is followed 
here. They compare the performance of this algorithm with several other algorithms: 

1. Kernighan-Lin Algorithm - a modified level structure algorithm that is im- 
plemented in SPARSPAK0. 

2. Fiduccia-Mattheyes Algorithm, implemented by Leiserson and Lewis [ p^ . 

3. Separator Algorithm of Liu ||lj], based on the Multiple Minimum Degree Al- 
gorithm. 

This report is an emulation of their algorithm, formally described in Algorithm 
2.2, largely taken from |^. This procedure partitions the set of vertices of a (sparse) 
graph, in the form specified in §|l|. Firstly, it applies the results in Algorithm 2.1 to 
the graph concerned, to yield Ei, an appropriate edge separator set, and A' and 
B' , sets of vertices on either side of this set. Secondly, a combinatorial procedure 
chooses from the vertices adjacent to this edge separator set, a vertex-separator set 
S, and defines the corresponding vertex sets of the banks A and B. 

Before listing the actual algorithm, the definition of a minimum cover is required. 

Definition 2.15 Given a graph G = {N,E), a (vertex) cover is a set of vertices 
S, such that every edge in E has at least one of its endpoints in S . 



^SPARSPAK is a package of sparse matrix routines, available through the netlib electronic software 
library 
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Algorithm 2.2 (Spectral Partitioning Algorithm) Given the sparse matrix 
of a graph G — {N,E), find a partition of the vertex set N — A (J B U S , 
such that \A\ « \B\ « n/2, and \S\ is "small", in a restricted sense. 

1. Compute the eigenvector X2 and the median value Xm of its components. 

2. Partition the vertices of G into 2 sets, A' = {v ^ N \ Xy ^ and B' = 
N\A'. If\A'\ — \B'\ > 1, move enough vertices with components equal to Xm 
from A' to B' to make this difference at most one. 

3. Define Ai as the set of vertices in A' adjacent to some vertex in B' , and Bi 
as the set of vertices in B' adjacent to some vertex in A' . Compute H = 
(Ai,Bi,Ei), the bipartite subgraph induced by the vertex sets Ai and Bi. 

4- The required vertex separator set S is a minimum cover of H . It separates G 
into subgraphs with vertex sets A = A' \ S and B = B' \ S . 



Definition 2.16 A minimum cover is a cover of minimum cardinality. 

Associated with the notion of minimum cover is that of a maximum matching, 
which requires another definition. 

Definition 2.17 A matching is a subset of E, such that no two endpoints in this 
subset have the same vertex. 



Definition 2.18 A maximum matching is a matching of maximal cardinality. 
Maximum matchings and minimum covers are dual concepts, and this is further 



discussed in S2.5. 



Several notes arise in regard to this algorithm: 

1. It must be recognised that the ideal aim is to find an S of absolute minimal 
cardinality, however this algorithm only finds S as small as possible in the con- 
text of the given edge separator set Ei . In general, the problem of finding the 
smallest possible S appears to be a combinatorially-explosive one that is not 
achievable by any algorithm efficient enough to be worth considering. In con- 
solation, demonstrate that the Spectral Partitioning algorithm generally 
finds a smaller S than its competitors. 

2. The problem of finding a minimum cover is non-trivial, and much research 
into it has been performed. Pothen et al. | pT| use a "maximum matching" 



technique (see §2.5), that is guaranteed to give the minimum cover for the 
given edge separator set. The actual implementation in this report uses a 
heuristic pro cedu re to c alculate an approximate minimum cover, described 



in Algorithm 2_^ in §2^. This procedure is not guaranteed to give a minimum 
cover. 



3. Algorithm 2.2 requires the use of a function to determine the median of a 



list of numbers. An algorithm to do this efficiently is a special case of an 
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algorithm to select the kth smallest component of a list of numbers. Page 129 
of describes an algorithm to do this in O(nlogn) time, by recursively 
partitioning the list. 

4. The problem of calculating the second eigenvector of the graph Laplacian 
is also non-trivial, and generally computationally expensive. It depends on 
the prior calculation of the second eigenvalue. The most efficient algorithm 
to find extremal (largest and smallest) eigenvalues of sparse matrices is the 
Lanczos algorithm. Unfortunately, the procedure is subject to severe numerical 
problems that make implementation complex, but in practice it is the only 
real choice. Implementation of the Lanczos algorithm is discussed in more 
detail in 



2.5 Minimum Cover 



This section deals with the problem of finding a minimum cover (see Definition 
2.16) of a bipartite graph. This problem occurs as a necessary component of the 
Spectral Partitioning algorithm (Algorithm |2.2| ) ~ it is desired to find a minimum 
cover of H = {Ai,Bi, Ei). One method of doing this is mentioned in but this 
has not bee n impl emented due to time constraints, and instead a heuristic procedure 
is followed. § 2.5.l| discusses references in whic h are found minimum cover algorithms 



(for both bipartite and general graphs) , and § 2.5.2 describes the heuristic procedure 
(for bipartite graphs only) actually implemented by me. 



2.5.1 True Solution 

The minimum cover problem has been solved in a number of ways. The earliest 
algorithms for the minimum cover of a bipartite graph are based on the "Dulmage- 
Mendelsohn decomposition" |l^, but these references are not particularly read- 
able. A more general result, for the minimum cover of any graph is found in [ p^ , 
but for the purposes of this partitioning, it is better to only consider algorithms for 
bipartite graphs, to maximise efficiency. For the rest of this section, consider the 
term graph to mean the bipartite graph H = (yli, Bi, i?i). Also define a = 
b~ \Bi\,e~ \Ei\ and n — a + b. 



As mentioned in §2^, the minimum cover of a bipartite graph is the dual of 
the maximum matching, although the details of this relationship are not explicitly 
supplied. Pothen ct al. cite a further paper |^ that details an algorithm for 
a maximum matching. A simplified description of this algorithm is found in fl^ , 
pp221-227], and is reported to solve the matching problem in 0(min(a, h).e) time. 

An alternative approach is presented in pages 495-499 of and deals with 
the matching problem in terms of a problem in "flow maximisation" . The algorithm 
described is quoted as requiring 0{n[e + n)logn) time (or 0{tv') time, for dense 
graphs). It is expected that H is in general sparse, but this algorithm does not 
appear to be as efficient as that used in pT[ . 

Implementation of a true minimum cover algorithm is left as a future exercise, 
see §|. 
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2.5.2 Heuristic Solution 



The heuristic procedure which I have used is formally described in Algorithm 2.3. 

Input data is an edge separator set Ei, generated by the vertices whose eigen- 
valuation crosses the median eigenvalue. Construct d, a listing of the degrees of the 
vertices in H — AiU Bi, and then, whilst any element in d is positive, perform the 
following procedure: Find the current vertex of maximum degree in d, add it to S, 
and remove it and the edges adjacent to it (in Ei), from d. Repeat until no edges of 
degree > remain in d. Setting dj to ensures that the vertex will not be considered 
again. Implementation is improved by the very cheap process of comparing the size 
of the resultant S with the sizes of Ai and Bi , and if S is larger than the smaller 
of these, S is replaced by the smaller one. Since no elements are removed from S, 
it can be implemented as a simple list or array. 

Algorithm 2.3 (Approximate Minimum Cover Algorithm) Given Ei, the 
edge separator set of edges of a bipartite graph on vertex sets Ai and Bi, find a 
vertex (separator) set S such that each edge in Ei is incident to a vertex in S, with 
\S\ "small". 

for ie AiUBi 

di ^ degree of i in Ei, 
while some > 

choose jsuch that dj — maxd^; 

dj ^ 0; 

for each iadjacent to j in Ei 
di ^ di - 1; 

if \S\ > \A,\ 

S^Ai- 
if \S\ > \B,\ 

S^B,; 



This strategy will work to varying degrees with different examples. In practice, 
it would be possible to create a (possibly pathological) example for which this 
algorithm would create an unreasonably large S. For sparse graphs of large diameter, 
it is expected that I ^ |^'| and <C so the process for finding S* should 
yield the desired set \S\ ^ n. It is guaranteed only that \S\ ^ min(|^i|, \Bi\). 



18 



Algorithm 2.4 (The Lanczos Algorithm) Given a symmetric A G M"^" and 
w G M" having unit 2-norm, compute a symmetric, tridiagonal matrix Tj e R^^^ 
with the property that X{Tj) C X{A). The diagonal and subdiagonal elements of Tj 
are stored in a{l : j) and (3(1 : j — 1) respectively. 

v = /3o = l; 3=0; 
while Pj 7^ 

if 3^0 

1 

t = w; w = — v; V 

end 

V = V + Aw; i = j + 1; 
aj = w^v; V = V — a^w; fij 

end 



-llvll^; 



2.6 Lanczos Algorithm 

The Lanczos algorithm (originally attributable to |l^) is an efficient method of 
finding the extremal eigenvalues of a sparse matrix. As the Spectral Partitioning 
algorithm is to be applied to sparse matrices, the Lanczos algorithm is required in 



its implementation. It is listed formally in Algorithm 2.4, copied almost verbatim 
from Chapter 9 of 

Practical implementation of the Lanczos algorithm almost always requires some 
reorthogonalisation. For the purposes of the Spectral Partitioning algorithm, we 
need only reorthogonalise against e (the rt- vector of ones) , as the subspace required 
must be orthogonal to this. Rounding error can be avoided by the following strat- 
egy: When computing y = Ax, instead of simply returning y, we will return y 
orthogonalised against e: 



e"'"y 

y ^ y e. 

n 

That is, return (/— e^e/n)Ax instead of Ax, as well as starting with Xq -L e. The 
latter requirement is satisfied by the choice of (xo)j = i — (rt + l)/2, i = 1, . . . , n, 
a choice recommended in l2ll. 
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2.7 Recursive Decomposition 

It is possible, and may be necessary for many applications, to be able to repeat 
the decomposition process on the subcomponents A and B of N. Once the Spectral 
Partitioning algorithm is implemented, it can be recursively called, until the sub- 
graphs A and B are of a certain manageable "atomic" size (about 3-10 vertices), 
yielding a complete permutation of the vertices in N, via the procedure described 
in Algorithm pTs] . 

Algorithm 2.5 (Recursive Decomposition Algorithm) Given a graph G — 
{N,E), on the set of vertices N = {1, . . . ,n}, decompose N into a permutation of 
itself such that the smallest size units in the permutation are no larger than p, the 
"atomic" size. 

function partition{N) 

Apply the Spectral Partitioning Algorithm to partition N into AU B U S; 
if \A\ > p 

A ^ partition{A); 
if \B\ > p 

B ^ partition{B)\ 
return {A\B\S) 
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3 Implementation 



The algorithms presented in §^ are implemented as a small collection ol functions 
and programs in ANSI-standard C that are interfaced with a C software library 
(meschach [|3|). They were written within a UNIX (BSD4.3) environment. Ap- 
pendix ^ provides a listing of the source code, although the functions and data- 
structures in MESCHACH that are referenced are not explicitly listed, and the reader 
is referred t o p3[ . 

Sections p.l| to 3.5 describe the operation and application of each of the pro- 
grams/functions listed in Appendix |^. Currently, there is no documentation on 
these codes apart from this section and comments in the relevant source code. §^ 
discusses the results of applying these functions to the sample graphs described in 
Appendix H. 



3.1 The Function decomp 

The function decomp is the primary partitioning (and recursive decomposition) 
routine written. Input/output parameters are: 

1. spjnat *L: Pointer to the sparse matrix of the graph Laplacian of the graph 
G that we wish to partition, of size n. Unchanged on exit. 

2. PERM *P: On entry, a pointer to a permutation (list) of length n of the actual 
numbers of the vertices in the set being partitioned. P is returned as the 
permutation of itself corresponding to the partition. 

3. PERM *A, PERM *B, PERM *S: The components of the partition. Currently 
not actually relevant, but will be used in future developments, where the 
structure of a recursive decomposition will also be returned. These are pointers 
to permutations of size n on entry, containing dummy data. Returned as 
correctly-sized and filled permutations. 

4. int rec_lvl: decomp is designed perform one of two types of tasks: 

(a) Partition a sparse graph once. 

(b) Recursively repeat this until the units involved are of size less than a 
nominated (currently hard-wired into decomp as 3) "atomic" size. P is 
returned as the permutation which will be most useful for subsequent 
factorisation. This is not directly useful, at present, as there is no record 
of the structure of P - again, this is left as future work, and is discussed 
in§|. 

rec_lvl is used to tell decomp which of these 2 tasks to perform. If rec_lvl 
is set to -1 by a driver program, decomp will only partition the graph once, 
any other choice allowing it to recursively partition the set until satisfied. The 
parameter is used as a record of the depth of recursion by decomp, so a driver 
program will typically only ever set rec_lvl to or —1. 

Currently, decomp is called by the driver program testdc (see j j3.5| ), and the 
parameter rec_lvl is set by testdc according to one of its input (argv) arguments. 
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3.2 The Function mk_sp_graph 



A function to generate random sparse graphs. It is not usually used for testing 
decomp, as the graphs generated are possibly not connected, and are not of small di- 
ameter, which means that we cannot expect small separator sets from them. Future 
versions of this function should generate essentially planar graphs of large diameter, 
which we could expect to partition into sets with small S. Input parameters are: 

1. unsigned int n: The order of the graph desired. 

2. unsigned int p: The average degree of the vertices. 

The function returns a pointer to a spjnat, and is called by testdc, a driver 
program for decomp. 



3.3 The Function select 

This function selects the kth smallest element of a series of (real) numbers, stored 
in a VEC *, (a pointer to a VEC). It is an implementation of an efficient algorithm 
in page 128]. In particular, it can be used to determine the median element of 
a series of numbers. 

Input parameters are: 

1. VEC *y: The vector involved. 

2. int k: select returns the value (not the position) of the kth smallest element 
of y. 

select returns a double, decomp calls it to find the second smallest element 
in the vector of "eigenvalues" returned by trieig. The function is also called by 



decomp to determine the second smallest eigenvalue in a list (see Algorithm 2.2). 



3.4 The Program gr_lap 

This program generates sparse matrices of graphs associated with solving Laplace's 
Equation on a rectangular 5-point grid. The graphs, not the graph Laplacians are 
generated. gr_lap automatically creates a file of the appropriate name, which can be 
changed for later use. For example (in a UNIX environment), typing the command: 
gr_lap 13 23 

creates a file called "Lap . 13 . 23" in the current directory, which contains (in stan- 
dard format for meschach to read), the sparse graph related to solving Laplace's 
Equation on a 13 x 23 grid. 
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3.5 The Program testdc 

This program is the main driver written for decomp, and is called with a series of 
(argv) input parameters. For example (in a UNIX environment), typing the com- 
mand: 

testdc 157 7 13 -1 
calls testdc, and tells it that we wish to use a graph on 157 vertices, with an average 
vertex degree of 7, and to only print out intermediate results of data-structures of 
size 13 or less, testdc will then prompt the user to select either one of a range 
of graphs stored in the current directory, or generate a random sparse graph, for 
partitioning. If the user chooses to input a graph that does not have the correct 
dimension, testdc will abort. This stupid-seeming system is also intended to allow 
the user to instruct testdc to generate random sparse graphs (using mk_sp_graph), 
with parameters n — 157, and p = 7. Thus, the user is expected to have some 
knowledge of the database of sparse graphs before using testdc. The system will 
be improved in subsequent versions of testdc and decomp. 



The final parameter is the value of rec_lvl (see §3.1) that we wish to initially 
pass to decomp. If set to -1, decomp is instructed to only partition the graph once. 
Any other value, or omission of it instructs decomp to recursively decompose the 
graph into atomic subunits. The atom-size is currently hard-wired into decomp (as 
3), but could become an input parameter in future versions, testdc currently calls 
select and decomp, as well as numerous subroutines from MESCHACH. 
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4 Results and Problems 



This section details some results obtained by 

1. Partitioning of all the graphs listed in Table || (page |3^ in Appendix ^). 

2. Full recursive decomposition of the 5 smallest ones. 

It refers to the example graphs listed in Appendix ^. These graphs were used in 
the development of the programs, and have been used to illustrate of the progress 
of the algorithms. There is only one picture of a grid graph - for Ishmail, on 55 
vertices. The other grid graphs are too large to depict. The successful partitioning 



encountered are discussed in 



of these graphs is mentioned in §4.1, recursive decomposition in §4.2, and problems 



Direct comparison with other published results (in particular is not possible, 
as the computers involved are of very different speeds. The test problems that 
have been used are largely from the Boeing-Harwell sparse matrix test problem 
library Access paths to this database were discovered too late for it to be 
used. Other implementational problems, such as the Boeing-Harwell database being 
stored as a column-oriented data-structure, made the application of the partitioning 
algorithm to this suite currently infeasible. (meschach deals primarily in terms of 
row-oriented data-structures.) 



4.1 Partitioning 

decomp, driven by testdc, has been used to correctly partition all of the graphs 
listed in Table The times (in CPU seconds) taken to do this on the University 
of Queensland's Mathematics Department Pyramid 9810 computer (operating un- 
der UNIX BSD4.3) are listed in Table The Pyramid 9810 is benchmarked by a 
LiNPACKp] routine at approximately 0.5 Mflop s^^. The computer used by Pothen 
et al. is a gray y-MP,^ and is expected to be benchmarked about 2 orders 
of magnitude faster than the Pyramid 9810. Results are roughly comparable with 
those in |2l], by incorporating a scaling factor of about 100. 

All but one of the separator sets listed in Table |l| are of the minimum possible 
size. The separator set of size 101 for Schlomo could have been improved (to size 
61), by changing tolerances used in decomp, but this would have required prohibitive 
amounts of memory. 



4.2 Full Recursive Decomposition 

The full recursive decomposition was successfully completed for the graphs Idit, 
Moshe, Itzchak, Shimuel, Ishmail, and Yacov, finding permutations of the vertices 
that were traced to be exactly what would be expected for the first four cases, 
decomp was allowed to run on Ishmail and Yacov until it had completed a full re- 
cursive decomposition - this ran to 6 levels of recursion for Yacov. Results appeared 
correct, although were not manually checked explicitly! The full recursion on the 
other (very large) problems was not attempted. 

^LINPACK is available through the netlib electronic software library 
^Trademark of CRAY Research 
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Name 


n 


1^1 


S 


CPU time (s) 


Hit 


7 


1 


3 


0.0 


Moshe 


20 


2 


9, 10 


0.1 


Itzchak 


31 


1 


30 


0.2 


Shimuel 


32 


2 


7, 11 


0.1 


Ishmail 


55 


5 


5, 16, 27, 38, 49 


0.1 


Yacov 


105 


5 


9, 31, 52, 73, 94 


0.3 


Yair 


121 


11 


59-69 


0.2 


Arieh 


505 


5 


50, 151, 252, 353, 454 


8.1 


Aaron 


2121 


21 


50, 151, 252, 2070 


22.0 


Schlomo 


6161 


101 


3079-3179 


44.4 


Shimshon 


6400 


80 


3120-3199 


57.8 



Table 1: Times (in CPU seconds) and \S\ for decomp to partition graphs listed in 
Table |. 



Caveat: the above comments refer to a previous version of decomp, not the 
one supphed in Appendix which has some minor implementational problems. 
Currently, decomp crashes at some point during recursive calls. 



4.3 Problems 

This section describes the most significant problem encountered in implementing 



the Spectral Partitioning algorithm (Algorithm 2.2 on page |16|). This problem is in 
the choice of m, the order of the tridiagonal matrix T,„ returned by the Lanczos 
algorithm. 

The first impression is that any choice of m will do, the larger the better, in 
estimating A2 accurately. The constraint is that the computational expense is de- 
pendent on m? (amongst other things), and m cannot be increased without limit. 
At the very least, m ^ n would appear to be an obvious, if exaggerated bound. 
There is then, an optimal choice of m, such that: 

1. A2 is found accurately. 

2. Minimal computational expense is involved. 

In fact, empirical observation demonstrates that there is typically a range of 
suitable m, this range varying in position and bandwidth with the problem en- 
countered. It is not possible to accurately predict this range in advance, although 
typically it is located in the vicinity of ^/n. {n is the number of vertices in the 
graph.) 

To make matters more difficult, if m is chosen from outside this range, the com- 
puted value of A2 will be wrong. It appears that if m is too small, A2 will be too 
large, sometimes not even corresponding to larger eigenvalues of the graph Lapla- 
cian. If m is too large, the calculated A2 cannot possibly be an eigenvalue (as it is 
supposed to be the second smallest one!), and with increasing m the returned values 
of A2 typically reduce, suddenly hitting 0, and remaining there. This phenomenon 
is called the "Ghost Eigenvalue" problem and has been studied but apparently 
not yet conquered. 
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Thus, a strategy is needed to choose an optimal m. Empirical observations sug- 
gest that if m is increased from below the optimal position, when the resulting 
computed values of A2 converge, they converge to the correct value. This is not 
necessarily true when decreasing m from above the optimum, but it may be so. 

Several approaches are possible in developing a strategy for choosing m: 

1. Begin with a value of m much larger than is expected to be adequate for 
the given problem. For example, if m = 20 is guessed to be optimum, try 
m — 40. Using this value, run the Lanczos algorithm and investigate the 
resulting A2. As the Lanczos algorithm has returned us Tm, by decrementing 
m and considering T(l : m — 1, 1 : m — 1), it is possible to calculate A2 using 
m — 1. This procedure is repeated until the computed A2 converge. It is not 
expensive, as no new calculations of Tm have to be made. The problems with 
this method are that: 

(a) It is not certain that convergence from above generates the correct Ei. 

(b) Calculation of for an overly large m is expensive. 

(c) The choice of the initial m is problem-dependent. If this choice cannot be 
automated, there is no hope of an automatic sparse matrix factorisation 
technique being derived. The only obvious way to automate this choice 
would be to use m = n, and this would be so expensive as to make the 
whole technique infeasible. 

This strategy has been considered, but was not implemented due to these 
concerns. Its prime advantage is in its ease of programming. 

2. The second method is to begin with a small m, say m — min(n, 3), and always 
maintain m at least this size. Run the Lanczos algorithm using this value of m, 
and return. First calculate the resulting A2, and then the value that A2 would 
have been if m were one less. If these two values are the same, accept m and 
A2, else increment m by 2 and re-run the Lanczos algorithm. This technique 
is suited to calling the Lanczos algorithm as an external function, but suffers 
from being wasteful in its computational requirements, as the Tm calculations 
have to be repeated each time m is incremented. In practice, this technique 
has been experimented with, and works, but has since been superseded by the 
next technique, which is much cheaper to implement, although more difficult 
to program. 

3. Follow a similar strategy to the above, but maintain all the information on 
the Lanczos algorithm as it progresses. In order to do this, a modified Lanc- 
zos algorithm is incorporated into the part of the code that examines the 
convergence of the computed second eigenvalue, so that the need for exter- 
nal function calls is removed. This technique is incorporated into the current 
version of decomp, that appears in Appendix |^. 

If the problem being considered is on n vertices, with average degree fc, and m 
is the actual size of the computed T matrix, then the computational cost of the 
second and third methods can be compared by the following analysis. describes 
the cost of one Lanczos iteration (without reorthogonalisation) as {2k + 8)n flops. 
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For the second method, returning Tj requires {2k + 8)nj flops. If this cost is 
repeated for j = 3, 5, . . . , m, then the total cost is: 

TO (to-1)/2 

^{2k + 8)nj = ^ (2A; + 8)n{2i + 1) w (fc + 4)nm'^/2 flops. 

j=3 i=l 

For example, for the 21 x 101 5-point grid graph Aaron, where fc « 4, n = 2121 
and observation suggests that m = 35 is a good choice, this represents ^ 10^ flops. 

Compare this with the cost for the third method, which is simply the cost of 
the last Lanczos algorithm call of the second (plus some small overhead). This is 
approximately {2k + 8)mn « 2 x 10^ flops, an improvement of order m/4. This 
means, for example, using the Pyramid 9810 computer, running at « 0.5 x 10^ flops 
s~^, calculating the second eigenvector takes 2 rather than 20 CPU seconds. 

In summary, none of the methods described are guaranteed to work, as they are 
all based on the observation that when the computed second eigenvalues converge, 
they converge to the correct value of A2. Thus their philosophy is heuristic, and needs 
refinement, but it appears to work for the examples tested. This is an outstanding 
problem for further work. 
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5 Further Work 



This section is a listing of directions for further work, and includes some commentary 
of the work of 

The incorporation of the Spectral Partitioning algorithm into a complete func- 
tion to factorise and solve large sparse linear systems using nested dissection is a 
primary research target, and would depend on a number of subsidiary goals, which 
are described below. 

1. Problems in d eali ng with the choice of m in the Lanczos algorithm have been 



discussed in §^^. Research is required to establish a more rigorous solution 



to this problem. 

2. decomp should be augmented to become a function that also returns the struc- 
ture of the permutation P. This might be made possible by using a ternary 
code associated with each element of P, to describe the level of recursion 
required to obtain it. This would facilitate a factorisation routine that recur- 
sively decomposed the large sparse matrix into manageable atomic-sized units. 
Implementation would be coupled with an analysis of the performance of this 
technique relative to other methods, such as Gauss-Seidel and Successive Over 
Relaxation. 

3. Currently, there is currently very little error-checking in any of the functions 
and programs that I have written, but this facility is readily implemented 
in the context of the interface with MESCHACH. Input parameters are not 
carefully checked for existence and correctness of dimension, and there is not 
much checking through the codes for other problems, especially numerical 
problems. In part, the latter is due to time constraints, but it is also due to 
the total complexity of the algorithm. Further work is needed to make the 
routines more robust. 

4. More efficient data-structures would improve programming and execution 
speed. In particular, the edge listing H, is currently represented as an \H\ x 2 
array of doubles (MAT *H). (The same is true for the degree listing D.) As they 
are intended to be integers, this means that input, output and comparison op- 
erations on their elements require repeated casts. This is poor programming, 
and consumes more CPU time than required. A replacement system could 
involve simple integer arrays, or more powerfully, it could be a C structure 
containing an array, each element of which points to 2 integers representing 
adjacent edges, modelled after other meschach structures. A possible data- 
structure and referencing description of this system is provided in Figure |[ 

In conjunction with this we could develop a graph-input routine, together 
with a suite of error-checking routines. Currently, the graph is input as an 
adjacency matrix. This could be exchanged for an incidence matrix input, 
decomp could internally convert this to the appropriate adjacency matrix. 
This would improve the ability of the user to input a graph accurately. The 
driver program could be extended to handle a series of different input formats. 



28 



typedef struct edge_elt 
{ 

unsigned int first, second; 
} edge_elt 

typedef struct edge_list 
{ 

unsigned int m, max_m; 
ee *edge_elt; 
} edge_list; 

PERM *decomp( . . . ) 
{ 

edge_list *D, *H; 

/* " (H->ee [i] ) .first" now replaces "H->me [i] [0] " */ 



Figure 1: C Data structure for edge- listing and degree- recording. 



Other possibilities include storing the adjacency matrix of the graph as a 
dense bit array, rather than a sparse integer C structure. This would however, 
make implementation far more difficult, and is not considered to be practical 
for the internal workings of decomp. The programs gr_lap and testdc, which 
involve input and output of graphs (as their adjacency matrices), could be 
made considerably more efficient by this practice. 

5. The nested dissection algorithm lends itself to implementation on parallel 
processing machines, and a long-term goal could be to implement it in such 
an environment. 

6. Pothen et al. ||2l| discuss various estimates for the size of an adequate separator 
set, calculable in terms of various eigenvalues of the graph Laplacian and 
other parameters of the graph. It might be valuable to investigate the use of 
these estimates for some internal consistency checks within the decomposition 
algorithm. 



The implementation of a true minimum cover algorithm, as described in § 2.5.1 , 
would ensure that \S\ is as small as possible. The algorithm presented in 
pp221-227] appears to be the most efficient and accessible choice. It could be 
expected that the increase in computational expense in using this minimum 
cover algorithm would be compensated for by the reduction in computational 
expense (due to the smaller \S\) in a full factorisation routine. 
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6 Listing of Graphic Examples 



This appendix is a listing and discussion of some of the graphic examples used in 
developing the code and algorithms. Names and descriptions of the experimental 
graphs are presented in Table |^. Should these names appear unfamiliar, the reader 
is informed that they are loosely transliterated Hebrew names of various people in 
the Hebrew Bible. There is no particular significance in this choice of names. 



Name 


n 


Description 


Idit 


7 


Simplest case, almost a tree 


Moshe 


20 


Hand-drawn example 


Itzchak 


31 


Almost a tree 


Shimuel 


32 


Simple example 


Ishmail 


55 


5 X 11 5-point grid 


Yacov 


105 


5 X 21 5-point grid 


Yair 


121 


11 X 11 5-point grid 


Arieh 


505 


5 X 101 5-point grid 


Aaron 


2121 


21 X 101 5-point grid 


Schlomo 


6161 


61 X 101 5-point grid 


Shimshon 


6400 


80 X 80 5-point grid 



Table 2: Listing of the experimental graphs. 



The large grid graphs Schlomo (61 x 101) and Shimshon (80 x 80) are used for 
comparison with the results of |2^ . 

6.1 Idit 

Idit is the smallest graph examined (7 vertices, 8 edges), and is depicted in Figure ||. 
Its graph Laplacian is presented in Figure |[ Figures ^ and || are plots of the eigen- 
value spectrum and the second eigenvector (respectively) of the graph Laplacian of 
Idit, as generated by matlab.Q 



3 




12 5 4 



Figure 2: Idit. 



*MATLAB is an (interpreted) matrix computation package, and is a trademark of The Mathworks, 
Inc. 
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Figure 3: The sparse graph Laplacian of Idit. 



Eigenvalue Spectrum of Idit 



4 

index 



Figure 4: The eigenvalue spectrum of Idit. 



2nd Eigenvector of Idit 



4 

index 



Figure 5: The second eigenvector of Idit. 
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6.2 Moshe 



Moshe is a hand-drawn graph made for illustration of the progress of the Spectral 
Partitioning algorithm. It is a planar graph (20 vertices, 32 edges, diameter 7), 
and is depicted in Figure ^. The operation of the Spectral Partitioning algorithm 
(Algorithm 2.2) is described for Moshe in the following paragraphs. 




A' = I , ... , 10 I 



' = {11, ...,21) 



Figure 6: Moshe. 



The initial graph partitioning process partitions A'' = {0, . . . , 19}, into 2 almost 
equal halves, A' = {0, ... ,9} and B' = {10, . . . , 19}, with an edge separator set of 
size 3: Ei — {(8, 10), (9, 10), (9, 11)}. Ai is the set of vertices in A' that are adjacent 
to vertices in B' , and vice- versa. Inspection shows that the relevant bipartite graph 
on El consists of Ai = {8, 9} and Bi = {10, 11}. 

Using only Ai, Bi and Ei, we seek to calculate a vertex separator set S, as a 
subset of the vertices in Ai and Bi, such that all edges in Ai and Bi are incident 
upon at least one vertex in S. We would like S to be of minimum cardinality. 
Whilst this may in general be a non-trivial problem, in this case inspection shows 
that {9, 10} , {8, 9} and {10, 11} are minimum covers, the first case corresponding 
to Algorithm |2.3| accepting the initial choice of S, the others corresponding to the 
use of Ai or Bi as the cover. 

Lastly, fiiidA^ A'\S and B = B'\S. Use of S" = {9, 10} gives A = {0,...,8} 
and B = {11, . . . , 19}. This choice is depicted in Figure |^. The resultant set of cut 
edges are drawn with dashed lines. 



S = (IO, 11] 




A=(0, ... ,9] 



B= |12,...,21} 



Figure 7: Moshe, after partitioning. 



Figures ^ and ^ are plots of the eigenvalue spectrum and the second eigenvector 
(respectively) of the graph Laplacian of Moshe, as generated by matlab. 
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Eigenvalue Spectrum of Moshe 



2 4 6 



10 12 14 16 18 
index 



20 



Figure 8: The eigenvalue spectrum of Moshe. 



2nd Eigenvector of Moshe 




Figure 9: The second eigenvector of Moshe. 
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6.3 Itzchak 

The graph Itzchak is presented in Figure Its adjacency matrix is quite illus- 
trative, and is presented in Figure Figures |l3| and 14 are plots of the eigen- 
value spectrum and the second eigenvector (respectively) of the graph Laplacian of 
Itzchak, as generated by matlab. 



Figure 10: The adjacency matrix L of Itzchak. 
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6.4 Shimuel 



Shimuel (Figure 12) is another example on a small number of vertices, also used for 
development purposes. It is set up to find the separator set = {11, 20}. Figures |l^ 
and |l^ are plots of the eigenvalue spectrum and the second eigenvector (respectively) 
of the graph Laplacian of Shimuel, as generated by matlab. 
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Figure 12: Shimuel. 
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Eigenvalue Spectrum of Itzchak 
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Figure 13: The eigenvalue spectrum of Itzchak. 



2nd Eigenvector of Itzchak 







10 



15 20 
index 



25 30 35 



Figure 14: The second eigenvector of Itzchak. 
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Eigenvalue Spectrum of Shimuel 



10 15 20 25 30 35 

index 



Figure 15: The eigenvalue spectrum of Shimuel. 



2nd Eigenvector of Siiimuel 

0.3 1 , , , , , , 1 

0.2- 

0- 

0.1 - o » O 

0.2- 

0.3 1 • • • • • • 

5 10 15 20 25 30 35 

index 

Figure 16: The second eigenvector of Shimuel. 
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6.5 Ishmail 



Ishmail is the only (5-point) grid graph represented in this appendix, and is depicted 
in Figure It is a 5 x 11 gri d (n = 55 vertices). The adjacency matrix of Ishmail 



illustrates that referred to in S2.2, and its form is: 



G = 



P 



where 



P = 



h 
P 

h 



h 
P 

h 



h 
P 

h 



h 
P 



.1 

1.1 

.1.1 

. .1.1 

. . .1.1 

. . . .1.1.... 

1.1... 

1.1.. 

1.1. 

1.1 

1 . 



1 



11 

22 
33 
44 













16 




















27 
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49 











10 

21 

32 
43 

54 



Figure 17: Ishmail. 



Figures and |T9| are plots of the eigenvalue spectrum and the second eigenvector 
(respectively) of the graph Laplacian of Ishmail, as generated by matlab. 

Section 4 (titled "Graph Products") of discusses the expected repetition of 
the second eigenvector of an m x n 5-point grid graph, showing that it can be found 
as the Kronecker (tensor, outer) product of the length-m path graph and an e- vector 
of size n. Examination of Figure [l9| (and Figure |2l]) clearly demonstrates this result. 
Clever exploitation of this property may reduce the overall computational expense 
required for the Spectral Partitioning algorithm, but not significantly, as the expense 
is dominated by the computations that yield A2 (see §4.2). As this property is only 
true for grid graphs, implementation would require decomp to have an extra input 
flag. 
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Eigenvalue Spectrum of Ishmail 
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Figure 18: The eigenvalue spectrum of Ishmail. 



2nd Eigenvector of Ishmail 
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Figure 19: The second eigenvector of Ishmail. 
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6.6 Yacov 



Yacov is a 5 X 21 S-point grid graph on n = 105 vertices, and is not depicted here. 
Figures and are plots of the eigenvalue spectrum and the second eigenvector 
(respectively) of the graph Laplacian of Yacov, as generated by matlab. 



Eigenvalue Spectrum of Yacov 
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Figure 20: The eigenvalue spectrum of Yacov. 



2nd Eigenvector of Yacov 




Figure 21: The second eigenvector of Yacov. 
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7 Code Listing 



As mentioned in §^ the algorithms presented in this report are implemented in 
ANSI-standard C, and interface with meschach, the C software library for nu- 
merical analysis written by David Stewart p3| . The contents of this appendix are 
listings of the source code that I have written for this work. Without access to [ p3[ , 
references to data structures and functions called from meschach will be mean- 
ingless. Listings of the data-structures and functions are not provided here as they 



are quite long. §4.3 mentions that the code written is not necessarily bug-free, and 



currently has very little error-handling capacity. 
7.1 Listing of decomp 

/* decomp */ 

Contents : This file is called "decomp. c", and contains the function 
"decomp" . 

Aim : (Recursively) decompose the vertex set of a (sparse) graph, 
represented by a graph Laplacicin L, into separator sets. 
This function returns a pointer to a permutation, which, 
when applied to L, will decompose it for efficient 
recursive LDL"T factorisation. 

Language : ANSI Stcindard C 

Author : David De Wit (and David Stewart) March 5 - June 19 1991 



#include <math.h> 
#include <stdio.h> 
#include "matrix .h" 
#include "matrix2 . h" 
#include " sparse . h" 
#include "sparse2 . h" 



extern double select (VEC *, u_int) ; 
extern int icmp(u_int *, u_int *) ; 



prt_tol, p; 



#define show_perm(PP) ((PP->size < prt_tol) ? out_perm(PP) :\ 

printf ("Permutation: size : %d\n" , PP->size) ) 
#define show_vec(VV) ((VV->dim < prt_tol) ? out_vec(VV) :\ 

printf ("Vector: dim: %d\n" , VV->dim)) 
#define show_mat(MM) ((MM->m < prt^tol) ? out_mat(MM) :\ 

printf ("Matrix: m: %d by 2\n" , MM->m) ) 
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PERM *decoiiip(sp_iiiat *L, 

{ 

int 



PERM 

VEC 
MAT 

row_elt 
sp_mat 



PERM *P, PERM *A, PERM *B, PERM *S, int rec.lvl) 

i, j, k, 1, n = L->m, tempA, tempB, tempH, 
inA , inB , inAl , inBl , inD , inS , inAdash. , inBdash , 
N_nonzer o_v , max_degree , vmax , imax , ev_good_enougli ; 
medval, L2old, L2new, L2tol = le-04, Rtol = le-01, 
yi , yj , sum, alplia, beta; 

*Adash, *Bdasli, *A1, *B1, *AA, *AB, *AS, *BA, *BB, *BS, 

♦pivot ; 
*a, *b, *c 
*C, *I, *Q 
*elt_list; 
*AL, *BL; 



*d, 
*T, 



*xO, *w, 
*D, *H; 



*y, *z, *resid, *V, *W, *TMP; 



if C!L I I !A I I IBM !S I I !P) 

error (E_NULL, "decomp") ; 
if CL->m != L->n | I P->size != n) 

error (E_SIZES, "decomp"); 

printfC'At top of decomp. rec_lvl: %d\n", rec_lvl) ; 



/* 0. Set up all the required matrix and vector elements. Firstly, initialise 
all those with dimensions fixed, depending on "n".*/ 



Adash = get_permCn); 
Al = get_perni(n) ; 
y = get_vec (n) ; 
resid = get_vec(n); 
V = get_vec(n) ; 
D = get_mat(n*p, 2); 



Bdash = get_permCn); 
Bl = get_perm(n) ; 
xO = get_vec(n) ; 
TMP = get_vec(n) ; 
W = get_vec(n) ; 
H = get_mat(n*p, 2); 



/* Initialise size of variable-size data- structures 1. They are soon resized. */ 



a - get_vec(l) ; 
c - get_vec(l) ; 
I = get_mat(i, 1) ; 
C = get_mat(l, 1) ; 
w = get_vec(l) ; 
pivot = get_perm(l) ; 



b - get_vec(l) ; 
d - get_vec(l) ; 
T = get_mat(l, 1) ; 
Q = get_mat(l, 1) ; 
z = get_vec(l) ; 



/* 1. Use the Lanczos method and a tridiagonal eigendecomposition routine to 
calculate the 2nd eigenvalue and corresponding eigenvector of L. Pothen, et al . 
suggest the choice for the initial xO, and we begin with W = normalised(xO) and 
V = LW. */ 

for (i = 0; i < n; xO->ve [i] = i - (n - l)/2, i++) ; 

L2old = 0; ev_good_enough = 0; 

while ( ! ev_good_ enough) 

{ 

j = 0; 

sv_mlt(1.0/n2(x0) , xO, W) ; 
sp_mv_mlt (L , W, V); 

L2new = L2old + 2*L2tol; beta = 1; 

MMle (j < 2 I I fabs((L2old - L2new) /L2new) > L2tol) 
{ 

If (j*n > 250000) 
■C 

printfC'j.n = 7od.%d = '/A ", j, n, j*n); 

printf ("Using too much memory, cutting out!\n"); 

exit(O) ; 

} 

a = v_resize(a, j); b = v_resize(b, 

if (j > 1) 

b->ve[j-2] = beta; 
Q = m_resize(Q, n, j); set_col(q, W) ; 

/* Store W in Q. */ 

a->ve[j-l] = alpha = in_prodCW, V); 
v_mltadd(V, W, -alpha, V); 
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/* Orthogonalise V relative to e. */ 

for (i = sum =0; i < n; sum += V->ve [i] , i++) ; 

for (i = 0, sum = sum/n; i < n; V->ve [i] -= sum, i++) ; 

beta = n2(V); cp_vec(W, TMP) ; 

sv_mlt(l/beta, V, W) ; sv_mltC-beta, TMP, V) ; 

sp_mv_mltCL, W, TMP); v.addCV, TMP, V); 

c = v_resizeCc, j); d = v_resizeCd, 

c = cp_vec(a, c) ; d = cp_vec(b, d) ; 

/* trieigCc, d, M) takes a tridiagonal matrix with diagonal entries c and 
off -diagonal entries d, finds tlie eigenvalues, and stores then in c. ♦/ 
trieigCc, d, MNULL) ; 

L2old = L2new; L2new = selectCc, 1); 

> 

if (L2new < 0) 
{ 

printf ( "Negative ev ! \n" ) ; exit (0) ; 

> 

/* Set up the T and I matrices, and the w vector. */ 

T = m_resize(T, j, j); zero_mat(T); 

for (i = 0; i < j - 1; i++) 
{ 

T->me[i][i] = a->ve [i] ; 

T->me[i + 1] [i] = T->me [i] [i + 1] = b->ve[i]; 

> 

T->me[j - 1] [j - 1] = a->ve[j - 1]; 

I = m_resize(I, j, j); I = id_mat(I); 

pivot = px_resize (pivot, j); pivot = px_id(pivot) ; 

w = v_resize(w, j); rand_vec(w); 

z = v_resize(z, j); C = m_resize(C, j, j); 

printf C"\nFinding y: j = 7od\n\tlambda2\t\tn2 (resid) \n" , j); 

C = sm_mlt(-L2new, I , C) ; C = m_add(T, C, C) ; 

for (i = 0; i < C->m; i++) 

if (C->me [i] [i] == 0) 

C->me[i][i] = MACHEPS; 
C = LUfactorCC, pivot); 
z = LUsolveCC, pivot, w, z) ; 
w = sv_nilt (1 . 0/n2(z) , z, w) ; 
L2old = L2new; 

L2new = in_prod(w, mv_mlt(T, w, z)); 

/* If the residual of eigenvector/value pair | | L.y - L2.y I |_2 / | | y | |_2 is 
too large , run again ... */ 

y = mv_mlt(Q, w, y) ; 

resid = sp_mv_mlt(L, y, resid); 

resid = v_mltadd(resid, y, -L2new, resid); 
printf C"\t7o25.20g\t\t°/o25.20g\n" , L2new, n2(resid)); 
ev_good_enough = Cn2 (resid) < Rtol*n2(y)); 
if ( ! ev_good_enough) 

xO = cp_vec(y, xO) ; 

> 

printf ("Using lambda2 = %20.15g\n", L2new) ; 

printf ("y\, the 2nd eigenvector of L: ") ; show_vec(y); 

/* Most of the continuous mathematics data structures are no longer needed. */ 

freevecCa) ; freevecCb) ; freevecCc) ; freevec(d) ; 

f reevec CxO) ; f reevec Cw) ; f reevec (z) ; f reeperm(pivot) ; 
freevec(V) ; f reevec (W) ; f reevec (TMP) ; 

freemat(C) ; freemat(I) ; freemat(Q) ; freemat(T) ; 
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/* 2. Calculate Adash, Bdash and H. Adash. and Bdash are the vertex sets 
generated by the partitioning of the vertices of G by H. H is an edge separator 
set of G, found using Fiedler's method, y is the 2nd eigenvector of the 
Laplacian matrix of G, with median value medval. */ 

/* 2.1. Calculate the medicin of the elements of the 2nd eigenvector. */ 

medval = selectCy, Cu_int) ( Cn + l)/2)); 

/* 2.2.1. Set up Adash and Bdash. */ 

for (i = Adash->size = Bdash->size = 0; i < y->dim; i++) 
if (y->ve[i] <= medval) 

Adash->pe [Adash- >size++] = i ; 

else 

Bdash->pe [Bdash- >size++] = i ; 

printf ( "First Pass Adash and Bdash made\n"); 

show_perm (Adash) ; show_perm (Bdash) ; 

/* 2.2. Set up H. Search through the upper half triangle of L, and, for 
each edge encountered, insert it into H only if the eigenvaluation of 
the vertices crosses the median. */ 

for (i = H->m = 0; i < L->m; i++) 
{ 

elt_list = (L->row[i]) .elt; 
yi = y->ve [i] ; 

for (k = 0; k < (L->row[i] ) . len; k++) 
if ((j = elt_list[k] .col) < i) 
< 

yj = y->ve[j] ; 

if (yi <= medval && yj > medval) 

{ 

H->me [H->m] [0] = 1 ; 
H->me [H->m++] [1] = j ; 

} 

else if (yj <= medval && yi > medval) 
{ 

H->me [H->m] [0] = j ; 
H->me [H->m++] [1] = i; 

} 

} 

> 

printf ("First Pass H madeXn") ; show_mat(H); 

/* 2.2.2. If I Adash I - iBdasliI > 1, move enough vertices with components equal 
to medval from Adash to Bdash. Not easy! Also must correct H. */ 

for (i = 0; i < Adash->size && Adash->size - Bdash->size > 1; i++) 
if (y->ve [i] == medval) 

■c 

for (j = 0; Adash->pe[j] != i; ; 
Adash->pe[j] = Adash->pe [ — Adash->size] ; 
Bdash->pe [Bdash->size++] = i; 
for (k = 0; k < H->m; k++) 

for (1=0; H->me[k] [0] == i ft* 1 < 2; 1++) 
H->me [k~] [1] = H->me [H->m~] [1] ; 

> 

f reevec (y) ; 

/* Sort Adash and Bdash. */ 

qsort (Adash- >pe , Adash->size , sizeof (u_int) , icmp) ; 
qsort (Bdash->pe , Bdash->size, sizeof (u_int) , icmp) ; 

printf ("2nd Pass Adash: ") ; show_perm(Adash) ; 

printf ("\n2nd Pass Bdash: ") ; show_perm(Bdash) ; 

printf ("\n2nd Pass H: ") ; show_mat(H); 
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/* 3. Calculate S, a vertex separator set of G. A reminder that H is an edge 
separator set, and Adash. and Bdash. are a disjoint cover of the vertex set of G. 
Al and Bl are the respective subsets of Adash and Bdash that consist of only 
vertices that adjoin edges in H. The two vertex sets A and B are found, such 
that A = Adash \ S, B = Bdash \ S) . Respective edges are not relevant to the 
further progress of "decomp" , and so are not found. S is optimally as small as 
possible, and one is found that is hopefully relatively small, by the better of 
2 algorithms that seem like a quick cind good way; 

1. Collect vertices of reducing degree from Al U Bl until H is covered. 

2. Use all the vertices of the smaller of the two sets, Al and Bl. */ 

/* 3.1.1. Set up D, a listing of the degrees of vertices in H. */ 

for (i = D->m = 0; i < H->m; i++) 
for (j = 0; j < 2; j++) 
{ 

tempH = Cu_iiit) H->nie [i] [j] ; 
for (k = InD = 0; k < D->m; k++) 

if ((u_int) D->me[k][0] == tempH) 
D->me [k] [InD = 1]++; 

if (linD) 
{ 

D->iiie [D->iii] [0] = tempH ; 
D->iiie [D->iii++] [1] ++ ; 

} 



pr intf ( " \nD : " ) ; show_mat (D) ; 

/* 3.1.2. Set up Al and Bl, respective sides of H. */ 

for (i = Al->size = Bl->size = 0; i < H->m; i++) 
{ 

tempH = H->me[i] [0] ; 

for (k = inAl = 0; k < Al->size; k++) 

inAl = inAl || Al->pe[k] == tempH; 
if (linAl) 

Al->pe [Al->size++] = tempH; 
tempH = H->me [i] [1] ; 

for (k = inBl = 0; k < Bl->size; k++) 

inBl = inBl II Bl->pe[k] == tempH; 
if (linBl) 

Bl->pe [Bl->size++] = tempH; 

} 

/* Sort Al and Bl. Not currently useful, but good for the future. */ 
qsort(Al->pe, Al->size, sizeof (u_int) , icmp) ; 
qsort(Bl->pe, Bl->size, sizeof (u_int) , icmp); 

printf("2nd Pass Al: "); show_perm(Al) ; 

printf ("\n2nd Pass Bl : ") ; show_permCBl) ; 

/* 3.2. Establish the permutation of the vertex separator set S, via a grimgy 
but apparently working piece of code. The algorithm used: 

1. Find the vertex of highest degree in D, such that: 

D->me[imax] [0] = vmax; D->me[imax] [1] = max_degree; 

2. Set the degree of this vertex to zero, decrement the number of 
non-zero vertices, and add this vertex to S. 

D->me [imax] [1] = 0; N_non_zero_v — ; S->pe [S->size++] = vmax; 

3. Search out the edges in H containing this vertex, and decrement the 
vertices adjacent to it in D. 

4. Repeat until no non-zero degree vertices remain. */ 

N_nonzero_v = D->m; S->size = 0; 

while (N_nonzero_v > 0) 

{ 

for (i = max_degree = 0; i < D->m; i++) 
if CD->me[i][l] > max_degree) 

max_degree = (u_int) D->me[imax = i] [1] ; 

D->me [imax] [1] = 0; N_nonzero_v — ; 

S->pe [S->size++] = vmax = (u_int) D->me [imax] [0] ; 
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for (i = 0; i < H->iii && N_iioiizero_v > && max_degree > 0; i++) 
for (j = 0; j < 2 && N_iioiizero_v > && max_degree > 0; 
if ((u_int) H->me[i][j] == vmax) 

{ 

k = (u_lnt) H->me[l] [! j] ; 

for (1 = 0; 1 < D->m ktc (ii_int) D->me [1] [0] != k; 1++) ; 

if (D->me [1] [1] > 0) 

{ 

D->me[l] [1] — ; 
max_degree — ; 

if ((u_int) D->iiie[l] [1] == 0) 
N_iionzero_v — ; 

> 

> 

} 

/* Sort S. */ 

qsort CS->pe , S->size, sizeof Cu_int) , icmp) ; 

printf ("\nFirst pass S: ") ; show_perm(S) ; 

/* 3.3. Test to see if the S that has been created is larger than the 
smaller of Al and Bl . If it is, replace S with this smaller set. */ 

if CS->size > Al->size) 

{ 

S = cp_permCAl, S) ; S->size = Al->size; 

} 

if (S->size > Bi->size) 
< 

S = cp_perm(Bl, S) ; S->size = Bl->size; 

> 

printf ("XnSecond pass S: "); show_perm(S) ; 

/* 3.4. Create A, B, such that A, B and S are a disjoint cover of N. */ 
for Ci = A->size = 0; i < Adash->size; i++) 
{ 

k = Adash->pe [i] ; 

for (j = inS = 0; j < S->size; 

inS = inS II S->pe[j] == k; 
if (!inS) 

A->pe [A->size++] = k; 

> 

for (i = B->size = 0; i < Bdash->size; i++) 
{ 

k = Bdash->pe [i] ; 

for (j = inS = 0; j < S->size; j++) 

inS = inS II S->pe[j] == k; 
if (!inS) 

B->pe [B->size++] = k; 

} 

/* Sort A and B. */ 

qsortCA->pe, A->size, sizeof Cu_int) , icmp) ; 
qsort (B->pe , B->size, sizeof Cu_int) , icmp) ; 



/* Print out the vital statistics of the whole process. */ 
printf ("\nSizes of various elements : \n\n") ; 



printf ("\t# Adash, Bdash: 
printf ("\tLambda2: 
printf ("\t# D: 
printf("\t# H: 
printf ("\t# Al and Bl; 



7od\t7od\n" , Adash->size 
7.f \n" , L2new) ; 
7d\n" , D->m) ; 
%d\n" , H->m) ; 
%d\t%d\n" , Al->size 



Bdash->size) ; 



printf ("A 
printf ("B 
printf ("S 



Bl->size) ; 
show_perm(A) ; 
show_perm(B) ; 
show_perm(S) ; 



printf ("jK***JK***>K***>K***JK***JK***>K******************\n\n") ; 

/* Dump unneeded data-structures here! */ 

freeperm(Al) ; freeperm (Adash) ; freemat(D) ; 

freeperm(Bl) ; freeperm (Bdash) ; freemat(H) ; 
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/* 4. Recursively partition tlie sets A and B by calling "decomp" on them. */ 

/* 4.1. Decompose A and B if their sizes are > 3. For example, for A, a sparse 
matrix AL is created (found by extracting the parts of L with indices in A) , 
then decomp is called with AL as L, A as P, and dummies AA, AB, AS as the other 
parameters. In fact, the routine does not actually need these other parameters, 
but they are used, as they may be required for further programming to also 
return the structure of the final permutation. Required structures are created 
in situ. */ 

if (A->size > 3 && rec.lvl != -1) 

{ 

AA = get _perm(A-> size) ; 

AB = get_permCA->size) ; 
AS = get_permCA->size) ; 
AL = sp_get_mat CA->size , A->size, 3*p) ; 

/* The following 9 lines of _slow_ code (to set the entries of AL) are a waste 
of time, but I haven't yet figured out what else to do. Ditto for BL. In fact, 
it appears that this is the slowest part of the whole algorithm! */ 

for (i = 0; i < A->size; i++) 

for Cj = 0; j < A->size; j++) 
{ 

k = (u_int) sp_get_val CL, A->pe [i] , A->pe [j] ) ; 
if (k != 0) 

sp_set_val(AL, i, j , k) ; 

} 

for (1 = 0; 1 < A->slze; 

sp_set_val(AL, i, i, (AL->row[i]) .len - 1); 

The alternative version uses the elements of A as sorted into increasing order. 
Now we can scan through the rows of L that corresponded to entries in A, and 
know that we only have to check the entries in the row of L up until the column 
number becomes >= row number. It looks something like: 

for (i = 0; i < A->size; i++) 
{ 

k = A->pe [i] ; 

elt_list = (L->row[k] ) .elt; 

for (j = 0; j < (L->roH[k] ) .len; 

{ 

for (1 = inA = 0; 1 < A->size; 1++) 

inA = inA II A->pe [1] == (elt.list [j] ) . val ; 

if (inA) 

sp_set_val(AL, i, j, -1); 

} 

} 

for (i = 0; i < A->size; i++) 

sp_set_val(AL, i, i, (AL->row [i] ) . len) ; 



if ( ! ck_symm(AL) ) 
{ 

printf ("Quitting as AL not synmietric ! \n") ; 
exit(O) ; 

} 

if ( ! ck_sums (AL) ) 

{ 

printf ("Quitting as AL has invalid row sums!\n"); 
exit(O) ; 

} 

decomp ( AL , A , AA , AB , AS , ++rec_lvl) ; rec_lvl — ; 

freeperm(AA) ; freeperm(AB) ; freeperm(AS) ; sp_f ree_mat (AL) ; 

> 
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If (B->size > 3 && rec.lvl != -1) 
{ 

BA = get_perm(B->size) ; 
BB = get_perm(B->size) ; 
BS = get_perm(B->size) ; 
BL = sp_get_niat (B->size , B->size, 3*p) ; 

for (1=0; i < B->si2e; i++) 

for (j = 0; j < B->size; 
{ 

k = (u_int) sp_get_val CL, B->pe [i] , B->pe [j] ) ; 
if (k != 0) 

sp_set_val (BL , i , j , k) ; 

> 

for (i = 0; i < B->size; i++) 

sp_set_val(BL, i, i, (BL->row[i] ) . len - 1); 



If (!ck_syiimi(BL)) 
{ 

printf ("Quitting as BL not symmetric ! \n") ; 
exit(O) ; 

} 

if ( ! ck.sums (BL) ) 
{ 

printf ("Quitting as BL lias invalid row sums!\n"); 
exit(O) ; 

} 

decomp(BL, B, BA, BB, BS, ++rec_lvl) ; rec.lvl — ; 

freeperni(BA) ; freeperm(BB) ; freeperm(BS) ; sp_f ree_mat (BL) ; 

} 

/* 4.2. Map tlie apparent A, B, S to the actual A, B, S, using P, the permutation 
of their actual names . */ 

for (i = 0; 1 < A->size; i++) 

A->pe[i] = P->pe [A->pe[i]] ; 
for (i = 0; i < B->size; i++) 

B->pe [i] = P->pe [B->pe [i] ] ; 
for (i = 0; i < S->size; i++) 

S->pe[i] = P->pe[S->pe[i]] ; 

for (i = 0; i < A->size; i++) 
P->pe[i] = A->pe[i] ; 

j = i; 

for (1=0; i < B->size; i++) 

P->pe[i + j] = B->pe[i]; 

j += i; 

for (i = 0; i < S->size; i++) 

P->pe[i + J] = S->pe[i]; 

return P; 

} 



int iciiip(u_iiit *pl, u_int *p2) 

{ 

return *pl - *p2; 

} 
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7.2 Listing of mk_sp_graph 

/* ink_sp_graph 



/)K ************************************************************* 



Contents : This file is called "mk_sp_graph. c" , and contains the 
function "mk_sp_graph" . 

Aim : Generate a random G, the sparse matrix of a (sparse) 
undirected, unvaluated graph on n vertices with an average 
degree of p (number of edges/vertex). Further, the graph is 
almost planar, and this is achieved by using only a narrow 
bandwidth (q) in the original matrix, followed by a 
scrambling of its indices. 

Language : ANSI Standard C 

Author : David De Wit March 5 - May 27 1991 



********************************************************************* */ 



#include <stdio.h> 
#include "matrix . h" 
#include "sparse .h" 



sp_mat *mk_sp_graph(sp_mat *G, u_int p, u_int q) 

{ 

u_int i, j, n = G->m; 

double temp, limit; 



srand(4) ; 

limit = ((double) p) *2147483648/( (double) n) ; 
for (1=0; i < n; i++) 

for (j = i + 1; (j < n) (j < i + q) ; j++) 

if (randO < limit) 

{ 

temp = sp_set_val (G , i, j, 1.0); 
temp = sp_set_val (G, j, i, 1.0); 

> 

return G; 

} 
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7.3 Listing of select 

/* select 



************************************************************ 



Contents : This file is called "select. c", and contains the function 
"select" . 

Aim : Select the kth smallest entry in vector a. On exit, the 
desired element is in its correct place. In particular, 
calling selectCa, intC(a->dim + l)/2)) finds the median. 

See "Algorithms", Sedgewick (1983), pl28. QA76.6.S435 

Language : ANSI Standard C 

Author : David De Wit March 4 - May 27 1991 



********************************************************************* */ 



#include <stdio.h> 
#include "matrix . h" 



double select (VEC *a, u_int k) 

{ 

int i , j , 1 , r ; 

double t, v; 

VEC *b; 



b = get_vec(a->dim) ; b = cp_vec(a, b) ; 

1 = 0; r = b->diiii - 1; 

while (r > 1) 
{ 

V = b->ve [r] ; i=l-l; j=r; 

do 

{ 

for b->ve[i] < v; i++) ; 

for Cj — ; b->ve[j] > v; j — ); 

t = b->ve [1] ; 

b->ve [i] = b->ve[j]; 

b->ve[j] = t; 
} while (j > i) ; 
b->ve[j] = b->ve[i]; 
b->ve [i] = b->ve[r]; 
b->ve[r] = t; 
if (i >= k) 

r = i - 1; 
if (i <= k) 

1 = i + 1; 

> 

return (b->ve [k - 1] ) ; 
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7.4 Listing of gr.lap 

/* gr.lap 



************************************************************* 



Contents : This file is called "gr_lap.c", and contains the program of 
the same name. 

Aim : Make the graph for solving Laplace's Equation on an M x N 
grid. 

Language : ANSI Standard C 
Author : David De Wit (and David Stewart) May 8 - May 27 1991 

********************************************************************* */ 



#include <stdio.h> 
#include <strings.h> 
#include "matrix . h" 
#include "sparse. h" 

#define index(i,j) (N*((i)-l)+(j)-l) 
main(int argc, char *argv[]) 

{ 

u_int i, j, M = atoi CargvCl] ) , N = atoi Cargv [2] ) ; 

char Htoutname ; 

sp_mat *A; 

A = sp_get_mat(M*N, M*N, 5); 
for (1=1; i <= M; i++) 

for (j = 1; j <= N; 

{ 

if (i < M) 

sp_set_val(A, indexCi, j) , index(i+l, j) , 1) 
if (i > 1) 

sp_set_val(A, iiidex(i,j), index(i-l, j) , 1) 
if (j < N) 

sp_set_valCA, index(i,j), indexCi, j+1) , 1) 
if (j > 1) 

sp_set_valCA, indexCi, j), indexCi, j-1) , 1) 

} 

outname = strcat Cstrcat Cstrcat C'Lap . " , argv[l]), "."), argv[2]); 
sp_fout_matCfopenCoutnaine, "w"). A); 

} 
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7.5 Listing of testdc 

/* testdc 



************************************************************* 



Contents : This file is called "testdc. c", and contains the program of 
the same name. It calls the function "decomp" . 

Aim : Test the function "decomp" by setting-up and solving a 
problem . 



Language : ANSI Standard C 



Author : David De Wit March 5 - June 19 1991 



********************************************************************* */ 



#include <stdio.h> 
#include "matrix .h" 
#include "matrix2 .h" 
#include "sparse. h" 



extern u_int prt_tol, p, q; 

extern PERM *decomp(sp_mat *, PERM *, PERM *, PERM *, PERM *, int) ; 

extern sp_mat *mk_sp_graphCsp_mat *, u_int, u_int) ; 

extern int ck_symm(sp_mat *) , ck_sums Csp_mat *) ; 



main(int argc, char *argv[]) 



u_int 


i, j, k, n = atoi(argv[l] ) , q = 6, nfiles = 11; 


int 


j_idx, rec_lvl = atoi(argv[4] ) ; 


PERM 


*A, *B, *S, *P; 


FILE 


*fp; 


row_elt 


*e; 


sp_row 


*r ; 


sp_mat 


*G; 


char 


*fname [] = {"Idit " , "Moshe" , "Itzchak" , "Shimuel' 



"Ishmail" , "Yacov" , "Yair" , "Arieh" , 
"Aaron" , "Schlomo" , "Shimshon"} ; 



/*. 0. Initialise some structures and constants for the problem. */ 
prt_tol = atoi Cargv[3] ) ; p = atoi (argv [2] ) ; 

A = get_permCn) ; B = get_perni(n) ; 

S = get_permCn); P = get_perm(n) ; 

P = px_id(P) ; 
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/* 1. Set up problem by randomly generating or reading in a sparse matrix. */ 
printf ("\nEnter a choice for tlie initial graph: \n\n") ; 
printf("\tO: Make a random sparse graph\n"); 
for (i = 0; i < nfiles; i++) 

printf ("\t%ld: Read file \"%s\"\n", i + 1, fname[i]); 
printf ("\nYour Choice : \n") ; scanf ("V»d" , &i) ; 

if CD 
{ 

printf C"\nReading a sparse graph from \"°/,s\"\n", f name [i - 1] ) ; 
fp = f open(f name [i - 1], "r") ; 
G - sp_f in_mat (fp) ; 

} 

else 
{ 

printf ("\nMaking a sparse graph on y,d verticesX, n) ; 
printfC'at an average %d edges\/vertex . \n" , p) ; 
G = sp_get_mat(n, n, 3*p) ; 
G = mk_sp_graphCG, p, q) ; 

> 

if (G->m != n) 

error (E.SIZES, "testdc"); 



/* 2. Calculate the sparse matrix of the graph Laplacian (L) of the graph 
represented by the sparse matrix G. Defining d[i] as the degree of vertex i in 
G; then L = diagCd) - G. All row and column sums in L = 0. G overwrites L. */ 

/* Ensure diagonal entries are in rows. */ 

for (i = 0; i < G->m; sp_set_val(G, i, i, 1.0), i++) ; 

/* Set the values of the entries . */ 
for (i = 0; i < G->m; i++) 
{ 

r = &CG->row[i] ) ; 

/* scan entries of row r */ 

for (j_idx = 0, e = r->elt; j_idx < r->len; j_idx++, e++) 
if (e->col == i) 

e->val = r->len - 1; /* diagonal entry */ 

else 

e->val = -1.0; /* off -diagonal entry */ 

> 

if (!ck_symm(G)) 
{ 

printf ("Quitting as graph Laplacian not symmetric ! \n" ) ; 
exit(O) ; 

> 

if (!ck_sums(G)) 
{ 

printf C "Quitting as grapli Laplacian lias invalid row suins!\n"); 

exit(O) ; 

> 



/* 3. Call the function to do the partitioning. */ 

printf ("\nCalling decomp from testdc ...\n\n"); 
P = decompCG, P, A, B, S, rec_lvl) ; 
printf ("\nBack in Kansas ... P:\n"); 
if CP->size < prt_tol) 
out_permCP) ; 

else 

printf ("Permutation, size : ZdVu" , P->size) ; 
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int ck_syinni(sp_mat *L) 

{ 

int i ; 

static VEC *x=VMULL, *yl=VMULL, *y2=VNULL; 



if (!L) 

error (E_NULL , "ck.symm") ; 
if (L->in != L->n) 

return FALSE; 

X = v_resize(x, L->m) ; yl = v_resize(yl, L->in) ; 

y2 = v_resize(y2, L->ni) ; 

for (i = 0; 1 < 3; i++) 
{ 

rand.vec (x) ; 

y2 = sp_vm_mlt (L, x, y2) ; 
if (n2(yl) > L->m*MACHEPS) 
return FALSE; 

> 

return TRUE; 

} 



************************************************************ 



int ck_suins(sp_mat *L) 

{ 

int i , j ; 

double sum; 

sp_row *r ; 

if (!L) 

error (E_NULL , " ck_sums " ) ; 
for (i = 0; i < L->m; i++) 
< 

r = «i(L->row[i] ) ; 
for (j = sum = 0; j < r->len; 

sum += r->elt[j] -val; 
if (fabs(sum) > L->m*MACHEPS) 

return FALSE; 

} 

return TRUE; 

} 



yl = sp_mv_mlt(L, x, yl) ; 
yl = v_sub(yl, y2, yl) ; 
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